Nucleotide sequence for NTHi strain 11 hap gene (start codon to
stop codon):

1 ATGAAAAAAA CTGTATTTCG TCTTAATTTT TTAACCGCTT GCATTTCATT 
51 AGGGATAGTA TCGCAAGCGT GGGCAGGTCA TACTTATTTT GGGATTGACT 
101 ACCAATATTA TCGTGATTTT GCCGAGAATG AAGGCAAGTT TGCAGTTGGG 
151 GCTAAAAATA TTGATGTTTA TAACAAAGAA GGGCAATTAG TTGGCACATC 

201 AATGACAAAA GCCCCGATGA TTGATTTCTC AGTCGTTTCC AGAAATGGAG
251 TTGCTGCCTT AGTAGGCGAT CAGTATATTG TGAGTGTGGC ACATAATGTA 

301 GGCTATACCA ATGTGGATTT TGGTGCTGAA GGACAAAATC CTGATCAACA
351 TCGTTTTACT TATAAAATTG TGAAACGGAA TAATTATAAT CACGATGCGA 
401 AGCACCGCTA TCTAGATGAC TACCATAATC CACGTTTACA TAAATTTGTA 
451 ACGGATGCGG CACCAATTGA TATGACTTCA CATATGGATG GCAATAAGTA 
501 TGCAAATAAG GAAAAATATC CTGAACGAGT ACGCGTCGGA TCTGGAGATC 
551 AGTATTGGGA TGACGATCAA AACAACAGAA CTTATTTATC TGACGGATAT
601 AATTATTTAA CAGGTGGGAA TACATATAAT CAAAGCGGTA GAGGTGATGG 
651 ATATTCATAT GTGAGAGGTG ATATTCGCAA AGTTGGCGAT TATGGTCCAT 
701 TACCGATTGC AAGTTCATTC GGGGACAGTG GATCTCCAAT GTTTATTTAT 
751 GATGCTGAAA CACAAAAATG gcTAATTAAT GGAGTATTGC GGGAGGGGCA 
801 ACCTTATACA GGCGAATTCG ATGGATTTCA ATTAGCCCGT AAATCTTTCC 
851 TTGATGAAAT TATACGCAAA GATCAACCAA ATGGTTTTTT AACCCCTAAG 
901 GGGAATGGCG TTTATACCAT TTCTAAAAGT GACGATGGGA TAGGAGTTGT 
951 TACTTCGAAA ATTGGAAAAC CTCGTGAAAT ACCTTTAGCG AACAACAAAT 

1001 TAAAAATAGA AGATAAAGAT ACTGTCTATA ATAACAGATA TAATGGTCCT
1051 AATATTTATT CTCCTCAATT AAACAATGGC AAGAATATTT ATTTTGGAGA 
1101 TGAAGAATTA GGATCCATAA CTTTAACGAC TGATATCGAT CAAGGTGCAG 
1151 GCGGTTTGTA TTTTGAGGGG GATTTTATAG TTTCGCCTAC CAAAAATGAA 
1201 ACGTGGAAAG GTGCGGGCAT TCATGTCAGT GAAATTAGTA CCGTTACTTG 
1251 GAAAGTAAAC GGCGTGGAAA ATGATCGACT TTCTAAAATC GGTAAAGGAA 
1301 CATTACACGT TAAAGCCAAA GGGGAAAATA AAGGTTCGAT CAGCGTAGGC 
1351 GATGGTAAAG TCATTTTGGA GCAGCAGGCA GACGATCAAG GCAACAAACA 
1401 AGCCTTTAGT GAAATTGGCT TGGTTAGCGG CAGAGGGACT GTTCAATTAA 
1451 ACGATGATAA ACAATTTGAT ACCGATAAAT TTTATTTCGG CTTTCGTGGT 
1501 GGTCGCTTAG ATCTTAACGG ACATTCATTA ACCTTTAAAC GTATCCAAAA 
1551 TACGGACGAG GGGGCGATGA TTGTGAACCA TAATACAACT CAAGTCGCTA 
1601 ATATTACTAT TACTGGGAAC GAAAGTATTA CTGCTCCATC TAATAAAAAT 
1651 AATATTAATA AACTTGATTA CAGCAAAGAA ATTGCCTACA ACGGCTGGTT 
1701 TNGCGAAACA GATAAAAATA AACATAATGG ACGATTAAAC CTTATTTATA 
1751 AACCAACCAC AGAAGATCGT ACTTTGCTAC TTTCAGGCGG CACAAACTTA 
1801 AAAGGCGATA TTACTCAAAC AAAAGGTAAA CTATTTTTCA GCGGTAGACC 
1851 GACACCCCAC GCCTACAATC ATTTAGACAA ACGTTGGTCA GAAATGGAAG 
1901 GTATCCCACA AGGCGAAATT GTGTGGGATT ACGATTGGAT TAACCGCACA 
1951 TTTAAAGCTG AAAACTTCCA AATTAAAGGG GGAAGTGCGG TGGTTTCTCG 
2001 CAATGTTTCT TCAATTGAGG GAAATTGGAC AGTCAGCAAT AATGCAAATG 
2051 CCACATTTGG TGTTGTGCCA AATCAGCAAA ATACCATTTG CACGCGTTCA
2101 GATTGGACAG GATTAACGAC TTGTAAAACA GTTAATTTAA CCGATAAAAA 
2151 AGTTATTGAT TCCATACCGA CAACACAAAT TAATGGTTCT ATTAATTTAA 
2201 CTGATAATGC AACAGTGAAT ATTAATGGTT TAGCAAAACT TAATGGTAAT 
2251 GTCACTTTAA TAAATCATAG CCAATTTACA TTGAGCAACA ATGCCACCCA
2301 AATAGGCAAT ATCAAACTTT CAAATCACGC AAATGCAAGG GTAAATAATG
2351 CCACTTTAAT GGGCGATGTG AATTTAGCGG ATACTAGCCG TTTTACATTA 
2401 AGCAATCAAG CAACACAGAT TGGCACAATC AGTCTTCATC AGCAAGCTCA 
2451 AGCAACAGTG GATAATGCAA ACTTGAACGG TAATGTGCAT TTAACGGATT 
2501 CTGCCAGATT TTCTTTAAAA AACAGTCATT TTTCGCACCA AATTCAGGGC 
2551 GACAAAGACA CAACAGTGAC GTTGGAAAAT GCGACTTGGA CAATGCCTAG 
2601 CGATACTACA TTGCAGAATT TAACGCTAAA TAATAGTACT GTTACGTTAA 
2651 ATTCAGCTTA TTCAGCTAGC TCAAATAATG CGCCACGTCG CCgCCGTTCA 
2701 TTAGAGACGG AAACAACGCC AACATCGGCA GAACATCGTT TCAACACATT 
2751 GACAGTAAAT GGTAAATTGA GCGGGCAAGG CACATTCCAA TTTACTCCAT 
2801 CTTTATTTGG CtATGAAAGC GATAAATTAA AATTATCCAA TGACGCTGAG 
2851 GGCGATTACA CATTATCTGT TCGCAACACA GGCAAAGAAC CCGTGACCCT 
2901 TGAGCAATTA ACTTTGGTTG AAAGCAAAGA TAATAAACCG TTATCAGACA 
2951 AACTCAAATT TACTTTAGAA AATGACCACG TTGATGCAGG TGCATTACGT 
3001 TATAAATTAG TGAAGAATAA GGGCGAATTC CGCTTGCATA ACCCAATAAA
3051 AGAGCAGGAA TTGCGCTCTG ATTTAGTAAG AGCAGAGCAA GCAGAACGAA
3101 CATTAGAAGC CAAACAAGTT GAACAGACTG CTGAAACACA AACAAGTAAT 
3151 GCAAGAGTGC GGTCAAGAAG AGCGGTGTTG TCTGATACCC CGTCTGCTCA 
3201 AAGCCTGTTA AACGCATTAG AAGTCAAACA AGCTGAACCG AATGCTAAAA
3251 CACAAAAAAG TAAGGCAAAA ACAAAAAAAG CGCGGTCAAA AAGAGCATTG 
3301 AGAGAAGCGT TTTCTGATAC CCCGCCTGAT CTAAGCCAGT TAAACGTATT 
3351 AGAAGCCGCA CTTAAGGTTA TTAATGCCCA ACCGCAAACA GAAAAAGAAC 
3401 GTCAAGCTCA AGAGGAAGAA GCGAAAAGAC AACGCaAACA AAAAGACTTG 
3451 ATCAGCCGTT ACTCAAATAG TGCGTTATCG GAGTTGTCTG CAACAGTAAA 
3501 TAGTATGCTT TCCGTTCAAG ATGAATTGGA TCGTCTTTTT GTAGATCAAG 
3551 CACAATCTGC CCTGTGGACA AATATCGCAC AGGATAAAAG ACGCTATGAT 
3601 TCTGATGCGT TCCGTGCTTA TCAGCAGAAA ACGAACTTGC GTCAAATTGG 
3651 GGTGCAAAAA GCCTTAGATA ATGGACGAAT TGGGGCGGTT TTCTCGCATA 
3701 GCCGTTCAGA TAATACCTTT GACGAACAGG TTAAAAATCA CGCGACATTA 
3751 ACGATGATGT CGGGTTTTGC CCAATATCAA TGGGGCGATT TACAATTTGG 
3801 TGTAAACGTG GGCGCGGGAA TTAGTGCGAG TAAAATGGCT GAAGAACAAA 
3851 GCCGAAAAAT TCATCGAAAA GCGATAAATT ATGGTGTGAA TGCAAGTTAT 
3901 CAGTTCCGTT TAGGGCAATT GGGTATTCAG CCTTATTTGG GTGTTAATCG
3951 ATATTTTATT GAACGTGAAA ATTATCAATC TGAAGAAGTG AAAGTGCAAA
4001 CACCGAGCCT TGCATTTAAT CGCTATAATG CTGGCATTCG AGTTGATTAT 
4051 ACATTTACCC CGACAGATAA TATCAGCGTT AAGCCTTATT TCTTTGTCAA 
4101 TTATGTTGAT GTTTCAAACG CTAACGTACA AACCACTGTA AATAGCACGA 
4151 TGTTGCAACA ATCATTTGGG CGTTATTGGC AAAAAGAAGT GGGATTAAAG 
4201 GCAGAAATTT TACATTTCCA ACTTTCCGCT TTTATCTCAA AATCTCAAGG 
4251 TTCACAACTC GGTAAACAGC AAAATGTGGG CGTGAAATTG GGCTATCGTT 
4301 GGTAA 
Amino acid sequence for NTHi strain 11 Hap protein (first amino
acid to last amino acid):

1 MKKTVFRLNF LTACISLGIV SQAWAGHTYF GIDYQYYRDF AENEGKFAVG 

51 AKNIDVYNKE GQLVGTSMTK APMIDFSVVS RNGVAALVGD QYIVSVAHNV

101 GYTNVDFGAE GQNPDQHRFT YKIVKRNNYN HDAKHRYLDD YHNPRLHKFV 

151 TDAAPIDMTS HMDGNKYANK EKYPERVRVG SGDQYWDDDQ NNRTYLSDGY 

201 NYLTGGNTYN QSGRGDGYSY VRGDIRKVGD YGPLPIASSF GDSGSPMFIY

251 DAETQKWLIN GVLREGQPYT GEFDGFQLAR KSFLDEIIRK DQPNGFLTPK 

301 GNGVYTISKS DDGIGVVTSK IGKPREIPLA NNKLKIEDKD TVYNNRYNGP

351 NIYSPQLNNG KNIYFGDEEL GSITLTTDID QGAGGLYFEG DFIVSPTKNE

401 TWKGAGIHVS EISTVTWKVN GVENDRLSKI GKGTLHVKAK GENKGSISVG

451 DGKVILEQQA DDQGNKQAFS EIGLVSGRGT VQLNDDKQFD TDKFYFGFRG 

501 GRLDLNGHSL TFKRIQNTDE GAMIVNHNTT QVANITITGN ESITAPSNKN 

551 NINKLDYSKE IAYNGWFXET DKNKHNGRLN LIYKPTTEDR TLLLSGGTNL 

601 KGDITQTKGK LFFSGRPTPH AYNHLDKRWS EMEGIPQGEI VWDYDWINRT 

651 FKAENFQIKG GSAVVSRNVS SIEGNWTVSN NANATFGVVP NQQNTICTRS

701 DWTGLTTCKT VNLTDKKVID SIPTTQINGS INLTDNATVN INGLAKLNGN

751 VTLINHSQFT LSNNATQIGN IKLSNHANAR VNNATLMGDV NLADTSRFTL 

801 SNQATQIGTI SLHQQAQATV DNANLNGNVH LTDSARFSLK NSHFSHQIQG 

851 DKDTTVTLEN ATWTMPSDTT LQNLTLNNST VTLNSAYSAS SNNAPRRRRS 

901 LETETTPTSA EHRFNTLTVN GKLSGQGTFQ FTPSLFGYES DKLKLSNDAE 

951 GDYTLSVRNT GKEPVTLEQL TLVESKDNKP LSDKLKFTLE NDHVDAGALR 

1001 YKLVKNKGEF RLHNPIKEQE LRSDLVRAEQ AERTLEAKQV EQTAETQTSN 

1051 ARVRSRRAVL SDTPSAQSLL NALEVKQAEP NAKTQKSKAK TKKARSKRAL 

1101 REAFSDTPPD LSQLNVLEAA LKVINAQPQT EKERQAQEEE AKRQRKQKDL 

1151 ISRYSNSALS ELSATVNSML SVQDELDRLF VDQAQSALWT NIAQDKRRYD

1201 SDAFRAYQQK TNLRQIGVQK ALDNGRIGAV FSHSRSDNTF DEQVKNHATL

1251 TMMSGFAQYQ WGDLQFGVNV GAGISASKMA EEQSRKIHRK AINYGVNASY

1301 QFRLGQLGIQ PYLGVNRYFI ERENYQSEEV KVQTPSLAFN RYNAGIRVDY

1351 TFTPTDNISV KPYFFVNYVD VSNANVQTTV NSTMLQQSFG RYWQKEVGLK 

1401 AEILHFQLSA FISKSQGSQL GKQQNVGVKL GYRW 
Nucleotide sequence for NTHi strain TN106 hap gene (start codon
begins at position 422, stop codon begins at position 4595):

1 TGGCGGCGGA CAAATTATTG CGACGGGTAC ACCAGAACAA GTTGCTAAAG
51 TAAAAAGTTC CCACACCGCT CGCTTCCTTA AACCGATTTT AGAAAAACCT 
101 TAGAAAAAAT GACCGCACTT TCAGAGAAAA CTCACATAAA GTGCGGTTAT 
151 TTTATTAGTG ATATTGTTTT AATTTTAGTT ATCTGTATAA ATTACATACA 
201 ATATTAATCC ATCGCAAGAT TAGATTACCC ACTAAGTATT AAGCAAAAAC 
251 CTAGAAATTT TGGCTTAATT ACTATATAGT TTTACTCATT TATTTTCTTT 
301 TGTGCCTTTT AGTTCATTTT TTTAGCTGAA ATCCCTTAGA AAATCACCGC
351 ACTTTTATTG TTCAATAGTC GTTTAACCAC GTATTTTTTA ATACGAAAAA
401 TTACTTAATT AAATAAACAT TATGAAAAAA ACTGTATTTC GTCTGAATTT 
451 TTTAACCGCT TGCATTTCAT TAGGGATAGT ATCGCAAGCG TGGGCAGGTC 
501 ATACTTATTT TGGGATTGAC TACCAATATT ATCGTGATTT TGCCGAGAAT 
551 AAAGGGAAGT TTACAGTTGG GGCTCAAGAT ATTGATATCT ACAATAAAAA 
601 AGGGGAAATG ATAGGTACGA TGATGAAAGG TGTGCCTATG CCTGATTTAT
651 CTTCCATGGT TCGTGGTGGT TATTCAACAT TGATAAGTGA GCAGCATTTA 
701 ATTAGCGTCG CACATAATGT AGGGTATGAT GTCGTTGATT TTGGTATGGA 
751 GGGGGAAAAT CCAGACCAAC ATCGTTTTAA GTATAAAGTT GTTAAACGAT 
801 ATAATTATAA GAGCGGTGAT AGACAATATA ATGATTATCA ACATCCAAGA 
851 TTAGAGAAAT TTGTAACGGA AACTGCACCT ATTGAAATGG TTTCATATAT 
901 GGATGGTAAT CATTACAAAA ATTTTAATCA ATATCCTTTG CGAGTTAGAG 
951 TTGGAAGTGG GCATCAATGG TGGAAAGACG ATAATAATAA AACCATTGGA
1001 GACTTAGCCT ATGGAGGTTC ATGGTTAATA GGTGGAAATA CCTTTGAAGA 
1051 TGGACCAGCT GGTAACGGTA CATTAGAATT AAATGGGCGA GTACAAAATC 
1101 CTAATAAATA TGGTCCACTA CCTACGGCAG GTTCATTCGG GGATAGTGGT 
1151 TCTCCAATGT TTATTTATGA TAAGGAAGTT AAGAAATGGT TATTAAATGG 
1201 CGTGTTACGT GAAGGAAATC CTTATGCTGC AGTAGGAAAC AGCTATCAAA 
1251 TTACACGAAA AGATTATTTT CAAGGTATTC TTAATCAAGA CATTACAGCT 
1301 AATTTTTGGG ATACTAATGC TGAATATAGA TTTAATATAG GGAGTGACCA
1351 CAATGGAAGA GTGGCAACAA TCAAAAGTAC ATTACCTAAA AAAGCTATTC 
1401 AGCCTGAACG AATAGTGGGT CTTTATGATA ATAGCCAACT TCATGATGCT 
1451 AGAGATAAAA ATGGCGATGA ATCTCCCTCT TATAAAGGTC CTAATCCATG 
1501 GTCGCCAGCA TTACATCATG GGAAAAGTAT TTACTTTGGC GATCAAGGAA 
1551 CAGGAACTTT AACAATTGAA AATAATATAA ATCAAGGTGC AGGTGGATTG 
1601 TATTTTGAAG GTAATTTTGT TGTAAAAGGC AATCAAAATA ATATAACTTG 
1651 GCAAGGTGCA GGCGTTTCTG TTGGAGAAGA AAGTACTGTT GAATGGCAGG 
1701 TGCATAATCC AGAAGGCGAT CGCTTATCCA AAATTGGGCT GGGAACCTTA 
1751 CTTGTTAATG GTAAAGGGAA AAACTTAGGA AGCCTGAGTG TCGGTAACGG 
1801 TTTGGTTGTG TTAGATCAAC AAGCAGATGA ATCAGGTCAA AAACAAGCCT
1851 TTAAAGAAGT TGGCATTGTA AGTGGTAGAG CTACCGTTCA ACTAAATAGT 
1901 GCAGATCAAG TTGATCCTAA CAATATTTAT TTCGGCTTTC GTGGTGGTCG 
1951 CTTAGATCTT AATGGGCATT CATTAACCTT TGAAGGTATC CAAAATACGG 
2001 ATGAAGGCGC GATGATTGTG AACCACAACG CTTCTCAAAC CGCAAATATT 
2051 ACGATTACAG GCAACGCAAC TATTAATTCA GATAGCAAAC AACTTACTAA
2101 TAAAAAAGAT ATTGCATTTA ACGGCTGGTT TGGTGAGCAA GATAAAGCTA
2151 AAACAAATGG TCGTTTAAAT GTGAATTATC AACCAGTTAA TGCAGAAAAT
2201 CATTTGTTGC TTTCTGGGGG GACAAATTTA AACGGCAATA TCACGCAAAA
2251 TGGTGGTACG TTAGTTTTTA GTGGTCGTCC AACGCCTCAT GCTTACAATC
2301 ATTTAAGAAG AGACTTGTCT AACATGGAAG GTATCCCACA AGGCGAAATT
2351 GTGTGGGATC ACGATTGGAT CAACCGCACA TTTAAAGCTG AAAACTTCCA
2401 AATTAAAGGC GGAAGTGCGG TGGTTTCTCG CAATGTTTCT TCAATTGAGG
2451 GAAATTGGAC AGTCAGCAAT AATGCAAATG CCACATTTGG TGTTGTGCGA
2501 AATCAGCAAA ATACCATTTG CACGCGTTCA GATTGGACAG GATTAACGAC
2551 TTGTAAAACA GTTGATTTAA CCGATAAAAA AGTTATTAAT TCCATACCGA
2601 CAACACAAAT TAATGGTTCT ATTAATTTAA CTGATAATGC AACAGTGAAT
2651 ATTCATGGTT TAGCAAAACT TAATGGTAAT GTCACTTTAA TAGATCACAG
2701 CCAATTTACA TTGAGCAACA ATGCCACCCA AACAGGCAAT ATCAAACTTT
2751 CAAATCACGC AAATGCAACG GTGGACAATG CAAATTTGAA CGGTAATGTG
2801 AATTTAATGG ATTCTGCTCA ATTTTCTTTA AAAAACAGCC ATTTTTCGCA
2851 CCAAATCCAA GGTGGGGAAG ACACAACAGT GATGTTGGAA AATGCGACTT
2901 GGACAATGCC TAGCGATACC ACATTGCAGA ATTTAACGCT AAATAATAGT
2951 ACTGTTACGT TAAATTCAGC TTATTCAGCT ATCTCAAATA ATGCGCCACG
3001 CCGTCGCCGC CGTTCATTAG AGACGGAAAC AACGCCAACA TCGGCAGAAC
3051 ATCGTTTCAA CACATTGACA GTAAATGGTA AATTGAGCGG GCAAGGCACA
3101 TTCCAATTTA CTTCATCTTT ATTTGGCTAT AAAAGCGATA AATTAAAATT
3151 ATCCAATGAC GCTGAGGGCG ATTACACATT ATCTGTTCGC AACACAGGCA
3201 AAGAACCCGT GACCTTTGGG CAATTAACTT TGGTTGAAAG CAAAGATAAT
3251 AAACCGTTAT CAGACAAACT CACATTCACG TTAGAAAATG ACCACGTTGA
3301 TGCAGGTGCA TTACGTTATA AATTAGTGAA GAATGATGGC GAATTCCGCT
3351 TACATAACCC AATAAAAGAG CAGGAATTGC GCTCTGATTT AGTAAGAGCA
3401 GAGCAAGCAG AACGAACATT AGAAGCCAAA CAAGTTGAAC AGACTGCTAA
3451 AACACAAACA AGTAAGGCAA GAGTGCGGTC AAGAAGAGCG GTGTTTTCTG
3501 ATCCCCTGCC TGCTCAAAGC CTGTTAAAAG CATTAGAAGC CAAACAAGCT
3551 CTGACTACTG AAACACAAAC AAGTAAGGCA AAAAAAGTGC GGTCAAAAAG
3601 AGCTGCGAGA GAGTTTTCTG ATACCCTGCC TGATCAAATA TTACAAGCCG
3651 CACTTGAGGT TATTGATGCC CAACAGCAAG TGAAAAAAGA ACCTCAAACT
3701 CAAGAGGAAG AAGAGAAAAG ACAACGCAAA CAAAAAGAAT TGATCAGCCG
3751 TTACTCAAAT AGTGCGTTAT CGGAGTTGTC TGCGACAGTA AATAGTATGC
3801 TTTCCGTTCA AGATGAATTG GATCGTCTTT TTGTAGATCA AGCACAATCT
3851 GCCGTGTGGA CAAATATCGC ACAGGATAAA AGACGCTATG ATTCTGATGC
3901 GTTCCGTGCT TATCAGCAGA AAACGAACTT GCGTCAAATT GGGGTGCAAA
3951 AAGCCTTAGA TAATGGACGA ATTGGGGCGG TTTTCTCGCA TAGCCGTTCA
4001 GATAATACCT TTGACGAACA GGTTAAAAAT CACGCGACAT TAGCGATGAT
4051 GTCGGGTTTT GCCCAATATC AATGGGGCGA TTTACAATTT GGTGTAAACG
4101 TGGGTGCGGG AATTAGTGCG AGTAAAATGG CTGAAGAACA AAGCCGAAAA
4151 ATTCATCGAA AAGCGATAAA TTATGGTGTG AATGCAAGTT ATCAGTTCCG
4201 TTTAGGGCAA TTGGGTATTC AGCCTTATTT GGGTGTTAAT CGATATTTTA
4251 TTGAACGTGA AAATTATCAA TCTGAAGAAG TGAAAGTGCA AACACCGAGC
4301 CTTGTATTTA ATCGCTATAA TGCTGGCATT CGAGTTGATT ATACATTTAC 
4351 CCCGACAGAT AATATCAGCA TTAAGCCTTA TTTCTTCGTC AATTATGTTG 
4401 ATGTTTCAAA CGCTAACGTA CAAACCACTG TAAATCGCAC GATGTTGCAA
4451 CAATCATTTG GGCGTTATTG GCAAAAAGAA GTGGGATTAA AGGCAGAAAT 
4501 TTTACATTTC CAACTTTCCG CTTTTATCTC AAAATCTCAA GGTTCACAAC 
4551 TCGGCAAACA GCAAAATGTG GGCGTGAAAT TGGGGTATCG TTGGTAAAAA
4601 TCAAC 
Amino acid sequence for NTHi strain TN106 Hap protein (first
amino acid to last amino acid):

1 MKKTVFRLNF LTACISLGIV SQAWAGHTYF GIDYQYYRDF AENKGKFTVG 

51 AQDIDIYNKK GEMIGTMMKG VPMPDLSSMV RGGYSTLISE QHLISVAHNV

101 GYDVVDFGME GENPDQHRFK YKVVKRYNYK SGDRQYNDYQ HPRLEKFVTE

151 TAPIEMVSYM DGNHYKNFNQ YPLRVRVGSG HQWWKDDNNK TIGDLAYGGS

201 WLIGGNTFED GPAGNGTLEL NGRVQNPNKY GPLPTAGSFG DSGSPMFIYD

251 KEVKKWLLNG VLREGNPYAA VGNSYQITRK DYFQGILNQD ITANFWDTNA 

301 EYRFNIGSDH NGRVATIKST LPKKAIQPER IVGLYDNSQL HDARDKNGDE

351 SPSYKGPNPW SPALHHGKSI YFGDQGTGTL TIENNINQGA GGLYFEGNFV

401 VKGNQNNITW QGAGVSVGEE STVEWQVHNP EGDRLSKIGL GTLLVNGKGK 

451 NLGSLSVGNG LVVLDQQADE SGQKQAFKEV GIVSGRATVQ LNSADQVDPN

501 NIYFGFRGGR LDLNGHSLTF ERIQNTDEGA MIVNHNASQT ANITITGNAT 

551 INSDSKQLTN KKDIAFNGWF GEQDKAKTNG RLNVNYQPVN AENHLLLSGG 

601 TNLNGNITQN GGTLVFSGRP TPHAYNHLRR DLSNMEGIPQ GEIVWDHDWI 

651 NRTFKAENFQ IKGGSAVVSR NVSSIEGNWT VSNNANATFG VVPNQQNTIC

701 TRSDWTGLTT CKTVDLTDKK VINSIPTTQI NGSINLTDNA TVNIHGLAKL

751 NGNVTLIDHS QFTLSNNATQ TGNIKLSNHA NATVDNANLN GNVNLMDSAQ 

801 FSLKNSHFSH QIQGGEDTTV MLENATWTMP SDTTLQNLTL NNSTVTLNSA 

851 YSAISNNAPR RRRRSLETET TPTSAEHRFN TLTVNGKLSG QGTFQFTSSL

901 FGYKSDKLKL SNDAEGDYTL SVRNTGKEPV TFGQLTLVES KDNKPLSDKL

951 TFTLENDHVD AGALRYKLVK NDGEFRLHNP IKEQELRSDL VRAEQAERTL 

1001 EAKQVEQTAK TQTSKARVRS RRAVFSDPLP AQSLLKALEA KQALTTETQT 

1051 SKAKKVRSKR AAREFSDTLP DQILQAALEV IDAQQQVKKE PQTQEEEEKR 

1101 QRKQKELISR YSNSALSELS ATVNSMLSVQ DELDRLFVDQ AQSAVWTNIA 

1151 QDKRRYDSDA FRAYQQKTNL RQIGVQKALD NGRIGAVFSH SRSDNTFDEQ 

1201 VKNHATLAMM SGFAQYQWGD LQFGVNVGAG ISASKMAEEQ SRKIHRKAIN 

1251 YGVNASYQFR LGQLGIQPYL GVNRYFIERE NYQSEEVKVQ TPSLVFNRYN 

1301 AGIRVDYTFT PTDNISIKPY FFVNYVDVSN ANVQTTVNRT MLQQSFGRYW

1351 QKEVGLKAEI LHFQLSAFIS KSQGSQLGKQ QNVGVKLGYR W 
Nucleotide sequence for NTHi strain 860295 hap gene (start codon
begins at position 430, stop codon begins at position 4738):

1 GGAGGCAGTG GTGGCGGACA AATTATTGCG ACGGGTACGC CAGAACAAGT 
51 TGCCAAAGTA GAAAGTTCCC ACACCGCCCG CTTCCTTAAA CCGATTTTAG 
101 AAAAACCTTA GAAAAAATGA CCGCACTTTC AGAGAAAACT CACATAAAGT 
151 GCGGTTATTT TATTAGTGAT ATTGTTTTAA TTTTAGTTAT CTGTATAAAT 
201 TACATATAAT ATTAATCCAT CGCAAGATAA GATTACCCAC TAAGTATTAA 
251 GCAAAAACCT AGAAATTTTG GCTTAATTAC TATATAGTTT TACTGCTTTA 
301 TTTTCTTTTG TGCCTTTTAG TTCGTTTTTT TAGCTGAAAT CCCTTAGAAA 
351 ATCACCGCAC TTTTATTGTT CAATAGTCGT TTAACCACGT ATTTTTTAAT 
401 ACGAAAAATT ACTTAATTAA ATAAACATTA TGAAAAAAAC TGTATTTCGT
451 CTGAACTTTT TAACCGCTTG CATTTCATTA GGGATAGTAT CGCAAGCGTG 
501 GGCAGGTCAC ACTTATTTTG GGATTGACTA CCAATATTAT CGTGATTTTG 
551 CTGAGAATAA AGGGAAGTTT TCAGTTGGGG CTAAAAATAT TGAGGTTTAT
601 AACAAAGAGG GGACTTTAGT TGGCACATCA ATGACAAAAG CCCCGATGAT 
651 TGATTTTTCT GTGGTGTCGC GAAATGGGGT GGCGGCATTA GTAGGCGATC 
701 AGTATATTGT GAGTGTGGCA CATAACGGTG GATATAATAG CGTTGATTTT 
751 GGAGCAGAAG GTCCAAATCC CGATCAGCAT CGTTTTACTT ATCAAATTGT 
801 AAAAAGAAAT AATTATAAGC CAGGCAAAGA TAACCCTTAT CATGGTGACT 
851 ATCACATGCC TCGTTTGCAC AAATTTGTCA CTGACGCTGA ACCAGCAAAG 
901 ATGACAGACA ATATGAATGG AAAGAACTAC GCTGATTTAA GTAAATATCC 
951 TGATCGTGTG CGTATTGGTA CAGGTGAACA ATGGTGGAGG ACTGATGAAG 
1001 AACAAAAGCA AGGAAGTAAG AGTTCATGGC TTGCTGATGC TTATCTGTGG 
1051 AGAATAGCAG GTAACACACA TTCACAAAGT GGAGCGGGCA ACGGCACGGT 
1101 AAACTTAAGT GGAGATATCA CAAAACCAAA TAACTATGGA CCTCTTCCTA
1151 CGGGTGTTTC GTTTGGAGAT AGTGGTTCTC CAATGTTTAT TTATGATGCA 
1201 ATAAAACAAA AATGGCTTAT TAATGGCGTA TTGCAAACTG GTAACCCTTT 
1251 CTCGGGAGCT GGAAATGGAT TCCAATTAAT TAGAAAAAAT TGGTTTTATG 
1301 ATAATGTCTT TGTAGAAGAT TTGCCTATAA CATTTTTAGA GCCAAGAAGT 
1351 AACGGTCATT ATTCATTTAC TTCAAATAAT AATGGAACTG GTACGGTTAC 
1401 TCAAACGAAT GAAAAAGTGA GTATGCCTCA ATTTAAAGTC AGAACGGTTC 
1451 AGTTATTTAA TGAAGCATTA AAAGAAAAAG ATAAAGAACC TGTTTATGCT 
1501 GCAGGTGGTG TAAATGCTTA TAAACCAAGA CTAAATAATG GTAAAAATAT 
1551 TTACTTTGGC GATCGAGGAA CAGGAACTTT AACAATTGAA AATAATATAA 
1601 ATCAAGGTGC TGGTGGTTTG TATTTTGAGG GTAACTTTAC GGTATCTTCA 
1651 GAAAATAATG CAACTTGGCA AGGTGCTGGA GTGCATGTAG GTGAAGACAG 
1701 TACTGTTACT TGGAAAGTAA ACGGCGTGGA ACATGATCGC CTTTCTAAAA 
1751 TTGGTAAAGG AACGTTGCAT ATTCAAGCAA AAGGTGAAAA CTTAGGCTCA 
1801 ATTAGCGTAG GTGACGGCAA AGTCATTTTA GATCAACAAG CCGATGAGAA 
1851 CAACCAAAAA CAAGCCTTTA AAGAAGTTGG CATTGTAAGT GGTAGAGCTA 
1901 CCGTTCAACT AAATAGTGCA GATCAAGTTG ATCCTAACAA TATTTATTTC 
1951 GGATTTCGTG GTGGTCGCTT AGATCTTAAC GGACATTCAT TAACCTTTAA 
2001 ACGTATCCAA AATACGGACG AGGGCGCGAT GATTGTGAAC CATAATACAA 
2051 CTCAAGTCGC TAATATTACT ATTACTGGGA ACGAAAGTAT TACTGCTCCA
2101 TCTAATAAAA ATAATATTAA TAAACTTGAT TACAGCAAAG AAATTGCTTA 
2151 CAACGGTTGG TTTGGCGAAA CAGATGAAAA TAAACACAAT GGAAGATTAA 
2201 ACCTTATTTA TAAACCAACC ACAGAAGATC GTACTTTGCT ACTTTCAGGT 
2251 GGAACAAATT TAAAAGGCAA TATTACTCAG GAAGGCGGCA CTTTAGTGTT 

2301 TAGTGGTCGC CCAACTCCAC ACGCTTACAA TCATTTAAAT CGCCCAAACG
2351 AGCTTGGGCG ACCTCAAGGC GAAGTGGTTA TTGATGACGA TTGGATCACC 

2401 CGCACATTTA AAGCTGAAAA CTTCCAAATT AAAGGCGGAA GTGCGGTGGT
2451 TTCTCGCAAT GTTTCTTCAA TTGAGGGAAA TTGGACAGTC AGCAATAATG 
2501 CAAATGCCGC ATTTGGTGTT GTGCCAAATC AGCAAAATAC CATTTGCACG 
2551 CGTTCAGATT GGACAGGATT AACGACTTGT AAAACTGTGG ATTTAACCGA 
2601 TACAAAAGTT ATTAATTCCA TACCGACAAC ACAAATTAAT GGCTCTATTA 
2651 ATTTAACTGA TAATGCAACA GTGAATATTC ATGGTTTAGC AAAACTTAAT
2701 GGTAATGTCA CTTTAATAAA TCATAGCCAA TTTACATTGA GCAACAATGC
2751 CACCCAAACA GGCAATATCC AACTTTCAAA TCACGCAAAT GCAACGGTGG 
2801 ACAATGCAAA TTTGAACGGT AATGTGCATT TAACGGATTC TGCTCAATTT
2851 TCTTTAAAAA ACAGCCATTT TTCGCACCAA ATTCAGGGCG ACAAAGACAC
2901 AACAGTGACG TTGGAAAATG CGACTTGGAC AATGCCTAGC GATGCCACAT
2951 TGCAGAATTT AACGCTAAAT AATAGTACTG TTACGTTAAA TTCAGCTTAT 
3001 TCAGCTAGCT CAAATAATGC GCCACGTCAC CGCCGTTCAT TAGAGACGGA 
3051 AACAACGCCA ACATCGGCAG AACATCGTTT CAACACATTG ACAGTAAATG
3101 GTAAATTGAG CGGGCAAGGC ACATTCCAAT TTACTTCATC TTTATTTGGC 
3151 TATAAAAGCG ATAAATTAAA ATTATCCAAT GACGCTGAGG GCGATTACAC 
3201 ATTATCTGTT CGCAACACAG GCAAAGAACC CGAAGCCCTT GAGCAATTAA 
3251 CTTTGGTTGA AAGCAAAGAT AATAAACCGT TATCAGACAA ACTCAAATTT
3301 ACTTTAGAAA ATGACCACGT TGATGCAGGT GCATTACGTT ATAAATTAGT 
3351 GAAGAATAAT GGCGAATTCC GCTTGCATAA CCCAATAAAA GAGCAGGAAT 
3401 TGCGCAATGA TTTAGTAAGA GCAGAGCAAG CAGAACGAAC ATTAGAAGCC 
3451 AAACAAGTTG AACAGACTGC TGAAACACAA ACAAGTAATG CAAGAGTGCG 
3501 GTCAAAAAGA GCGGTGTTTT CTGATACCCT GCCTGATCAA AGCCAGTTAG 
3551 ACGTATTACA AGCCGAACAA GTTGAACCGA CTGCTGAAAA ACAAAAAAAT 
3601 AAGGCAAAAA AAGTGCGGTC AAAAAGAGCG GTGTTTTCTG ATACCCTGCC 
3651 TGATCAAAGC CAGTTAGACG TATTACAAGC CGAACAAGTT GAACCGACTG 
3701 CTGAAAAACA AAAAAATAAG GCAAAAAAAG TGCGGTCAAA AAGAGCCGCG 
3751 AGAGAGTTTT CTGATACCCC GCTTGATCTA AGCCGGTTAA AGGTATTAGA 
3801 AGTCAAACTT GAGGTTATTA ATGCCCAACA GCAAGTGAAA AAAGAACCTC 
3851 AAGATCAAGA GAAACAACGC AAACAAAAAG ACTTGATCAG CCGTTATTCA 
3901 AATAGTGCGT TATCAGAATT ATCTGCAACA GTAAATAGTA TGCTTTCTGT 
3951 TCAAGATGAA TTAGATCGTC TTTTTGTAGA TCAAGCACAA TCTGCCGTGT 
4001 GGACAAATAT CGCACAGGAT AAAAGACGCT ATGATTCTGA TGCGTTCCGT 
4051 GCTTATCAGC AGAAAACGAA CTTACGTCAA ATTGGGGTGC AAAAAGCCTT 
4101 AGCTAATGGA CGAATTGGGG CAGTTTTCTC GCATAGCCGT TCAGATAATA
4151 CTTTTGATGA ACAGGTTAAA AATCACGCGA CATTAACGAT GATGTCGGGT
4201 TTTGCCCAAT ATCAATGGGG CGATTTACAA TTTGGTGTAA ACGTGGGAAC 
4251 GGGAATCAGT GCGAGTAAAA TGGCTGAAGA ACAAAGCCGA AAAATTCATC

4301 GAAAAGCGAT AAATTATGGC GTGAATGCAA GTTATCAGTT CCGTTTAGGG

4351 CAATTGGGCA TTCAGCCTTA TTTTGGAGTT AATCGCTATT TTATTGAACG 

4401 TGAAAATTAT CAATCTGAGG AAGTGAAAGT GAAAACGCCT AGCCTTGCAT 

4451 TTAATCGCTA TAATGCTGGC ATTCGAGTTG ATTATACATT TACTCCGACA 

4501 GATAATATCA GCGTTAAGCC TTATTTCTTC GTCAATTATG TTGATGTTTC 

4551 AAACGCTAAC GTACAAACCA CGGTAAATAG CACGGTGTTG CAACAACCAT

4601 TTGGACGTTA TTGGCAAAAA GAAGTGGGAT TAAAAGCGGA AATTTTACAT

4651 TTCCAACTTT CTGCTTTTAT TTCTAAATCT CAAGGTTCGC AACTCGGCAA

4701 ACAGCAAAAT GTGGGCGTGA AATTGGGGTA TCGTTGGTAA AAATCAACAT
4751 AATTGTATCG TTTATTGATA AACAAGGTGG GGCAGATCCC ACCTTTTTTA 

4801 TTTCAATAAT GGAACTTTAT TTAATTAAGA GCATCTAAGT AGCACCCCAT
4851 ATAGGGGATT AATTAAGAGG ATTTAATAAT GAATTTAACT AAACTTTTAC 

4901 CAGCATTTGC TGCTGCAGTC GTATTATCTG CTTGTGCAAA GGATGCACCT
4951 GAAATGACAA AATCATCTGC GCAAATAGCT GAAATGCAAA CACTTCCAAC 

5001 AATCACTGAT AAAACAGTTG TATATTCGTG CAATAAACAA ACTGTAACTG
5051 CCGTGTATCA ATTTGAAAAC CAAGAACCAG TTGCTGCAAT GGTAAGTGTG 
5101 GGCGATGGCA TTATTGCGAA AGATTTTACT CGTGATAAAT CACAAAATGA 
5151 CTTTACAAGT TTCGTTTCTG GGGATTATGT TTGGAATGTA GATAGTGGCT 
5201 TAACGTTAGA TAAATTTGAT TCTGTTGTGC CTGTCAATTT AATTC 
Amino acid sequence for NTHi strain 860295 Hap protein (first
amino acid to last amino acid):

1 MKKTVFRLNF LTACISLGIV SQAWAGHTYF GIDYQYYRDF AENKGKFSVG 

51 AKNIEVYNKE GTLVGTSMTK APMIDFSVVS RNGVAALVGD QYIVSVAHNG

101 GYNSVDFGAE GPNPDQHRFT YQIVKRNNYK PGKDNPYHGD YHMPRLHKFV 

151 TDAEPAKMTD NMNGKNYADL SKYPDRVRIG TGEQWWRTDE EQKQGSKSSW

201 LADAYLWRIA GNTHSQSGAG NGTVNLSGDI TKPNNYGPLP TGVSFGDSGS

251 PMFIYDAIKQ KWLINGVLQT GNPFSGAGNG FQLIRKNWFY DNVFVEDLPI

301 TFLEPRSNGH YSFTSNNNGT GTVTQTNEKV SMPQFKVRTV QLFNEALKEK

351 DKEPVYAAGG VNAYKPRLNN GKNIYFGDRG TGTLTIENNI NQGAGGLYFE

401 GNFTVSSENN ATWQGAGVHV GEDSTVTWKV NGVEHDRLSK IGKGTLHIQA

451 KGENLGSISV GDGKVILDQQ ADENNQKQAF KEVGIVSGRA TVQLNSADQV

501 DPNNIYFGFR GGRLDLNGHS LTFKRIQNTD EGAMIVNHNT TQVANITITG

551 NESITAPSNK NNINKLDYSK EIAYNGWFGE TDENKHNGRL NLIYKPTTED

601 RTLLLSGGTN LKGNITQEGG TLVFSGRPTP HAYNHLNRPN ELGRPQGEVV

651 IDDDWITRTF KAENFQIKGG SAVVSRNVSS IEGNWTVSNN ANAAFGVVPN

701 QQNTICTRSD WTGLTTCKTV DLTDTKVINS IPTTQINGSI NLTDNATVNI 

751 HGLAKLNGNV TLINHSQFTL SNNATQTGNI QLSNHANATV DNANLNGNVH 

801 LTDSAQFSLK NSHFSHQIQG DKDTTVTLEN ATWTMPSDAT LQNLTLNNST

851 VTLNSAYSAS SNNAPRHRRS LETETTPTSA EHRFNTLTVN GKLSGQGTFQ 

901 FTSSLFGYKS DKLKLSNDAE GDYTLSVRNT GKEPEALEQL TLVESKDNKP

951 LSDKLKFTLE NDHVDAGALR YKLVKNNGEF RLHNPIKEQE LRNDLVRAEQ 

1001 AERTLEAKQV EQTAETQTSN ARVRSKRAVF SDTLPDQSQL DVLQAEQVEP 

1051 TAEKQKNKAK KVRSKRAVFS DTLPDQSQLD VLQAEQVEPT AEKQKNKAKK 

1101 VRSKRAAREF SDTPLDLSRL KVLEVKLEVI NAQQQVKKEP QDQEKQRKQK 

1151 DLISRYSNSA LSELSATVNS MLSVQDELDR LFVDQAQSAV WTNIAQDKRR 

1201 YDSDAFRAYQ QKTNLRQIGV QKALANGRIG AVFSHSRSDN TFDEQVKNHA 

1251 TLTMMSGFAQ YQWGDLQFGV NVGTGISASK MAEEQSRKIH RKAINYGVNA 

1301 SYQFRLGQLG IQPYFGVNRY FIERENYQSE EVKVKTPSLA FNRYNAGIRV 

1351 DYTFTPTDNI SVKPYFFVNY VDVSNANVQT TVNSTVLQQP FGRYWQKEVG 

1401 LKAEILHFQL SAFISKSQGS QLGKQQNVGV KLGYRW 
Nucleotide sequence for NTHi strain 3219B hap gene (start codon
begins at position 388, stop codon begins at position 4561):



1 CCTGAAGACG TTGCTCAAGT TAAAGGCTCT CACACAGCCC GATTCCTTAA 
51 ACCGATTTTA GAAAAACCTT AGAAAAAATG ACCGCACTTT CAGAGAAAAC 
101 TCACATAAAG TGCGGTTATT TTATTAGTGA TATTGTTTTA ATTATTTGTA 
151 TAAATTACAT ACAATATTAA TCCATCGAAA AATAAGATTA CCCACTAAGT 

201 ATTAAGCCAA AACCTAGAAA TTTTGGCTTA ATTACTATAT AATTTTACTC
251 CTTTATTTTC TTTTGTGCCT TTTAGTTAGT TCGTTTTTTA GCTGAAATCC 

301 CTCAGAAAAT CACCGCACTT TTATTGTTCA ATAGTCGTTT AACCACGTAT

351 TTTTTAATAC GAAAAATTAC TTAATTAAAT AAACATTATG AAAAAAACTG

401 TATTTCGTCT TAATTTTCTA ACCGCTTGTA TTTCATTAGG GATAGTATCG

451 CAAGCGTGGG CAGGTCACAC TTATTTTGGG ATTGACTACC AATATTATCG

501 TGATTTTGCC GAGAATAAAG GGAAGTTTAC AGTTGGGGCT CAAGATATTG
551 ATATCTACAA TAAAAAAGGG GAAATGATAG GTACGATGAT GAAAGGTGTG 
601 CCTATGCCTG ATTTATCTTC CATGGTTCGT GGTGGTTATT CAACATTGAT 
651 AAGTGAGCAG CATTTAATTA GCGTCGCACA TAATGTAGGG TATGATGTCG 
701 TTGATTTTGG TATGGAGGGG GAAAATCCAG ACCAACATCG TTTTAAGTAT 
751 AAAGTTGTTA AACGATATAA TTATAAGAGC GGTGATAGAC AATATAATGA 
801 TTATCAACAT CCAAGATTAG AGAAATTTGT AACGGAAACT GCACCTATTG
851 AAATGGTTTC ATATATGGAT GGTAATCATT ACAAAAATTT TAATCAATAT 
901 CCTTTGCGAG TTAGAGTTGG AAGTGGGCAT CAATGGTGGA AAGACGATAA 
951 TAATAAAACC ATTGGAGACT TAGCCTATGG AGGTTCATGG TTAATAGGTG 

1001 GAAATACCTT TGAAGATGGA CCAGCTGGTA ACGGTACATT AGAATTAAAT 
1051 GGGCGAGTAC AAAATCCTAA TAAATATGGT CCACTACCTA CGGCAGGTTC 
1101 ATTCGGGGAT AGTGGTTCTC CAATGTTTAT TTATGATAAG GAAGTTAAGA 
1151 AATGGTTATT AAATGGCGTG TTACGTGAAG GAAATCCTTA TGCTGCAGTA 
1201 GGAAACAGCT ATCAAATTAC ACGAAAAGAT TATTTTCAAG GTATTCTTAA 
1251 TCAAGACATT ACAGCTAATT TTTGGGATAC TAATGCTGAA TATAGATTTA 
1301 ATATAGGGAG TGACCACAAT GGAAGAGTGG CAACAATCAA AAGTACATTA 
1351 CCTAAAAAAG CTATTCAGCC TGAACGAATA GTGGGTCTTT ATGATAATAG 
1401 CCAACTTCAT GATGCTAGAG ATAAAAATGG CGATGAATCT CCCTCTTATA 
1451 AAGGTCCTAA TCCATGGTCG CCAGCATTAC ATCATGGGAA AAGTATTTAC 
1501 TTTGGCGATC AAGGAACAGG AACTTTAACA ATTGAAAATA ATATAAATCA
1551 AGGTGCAGGT GGATTGTATT TTGAAGGTAA TTTTGTTGTA AAAGGCAATC 
1601 AAAATAATAT AACTTGGCAA GGTGCAGGCG TTTCTGTTGG AGAAGAAAGT 
1651 ACTGTTGAAT GGCAGGTGCA TAATCCAGAA GGCGATCGCT TATCCAAAAT 
1701 TGGGCTGGGA ACCTTACTTG TTAATGGTAA AGGGAAAAAC TTAGGAAGCC 
1751 TGAGTGTCGG TAACGGTTTG GTTGTGTTAG ATCAACAAGC AGATGAATCA 
1801 GGTCAAAAAC AAGCCTTTAA AGAAGTTGGC ATTGTAAGTG GTAGAGCTAC 
1851 CGTTCAACTA AATAGTGCAG ATCAAGTTGA TCCTAACAAT ATTTATTTCG 
1901 GCTTTCGTGG TGGTCGCTTA GATCTTAATG GGCATTCATT AACCTTTGAA
1951 CGTATCCAAA ATACGGATGA AGGCGCGATG ATTGTGAACC ACAACGCTTC 
2001 TCAAACCGCA AATATTACGA TTACAGGCAA CGCAACTATT AATTCAGATA 
2051 GCAAACAACT TACTAATAAA AAAGATATTG CATTTAACGG CTGGTTTGGT
2101 GAGCAAGATA AAGCTAAAAC AAATGGTCGT TTAAATGTGA ATTATCAACC 
2151 AGTTAATGCA GAAAATCATT TGTTGCTTTC TGGGGGGACA AATTTAAACG 
2201 GCAATATCAC GCAAAATGGT GGTACGTTAG TTTTTAGTGG TCGTCCAACG 
2251 CCTCATGCTT ACAATCATTT AAGAAGAGAC TTGTCTAACA TGGAAGGTAT 
2301 CCCACAAGGC GAAATTGTGT GGGATCACGA TTGGATCAAC CGCACATTTA
2351 AAGCTGAAAA CTTCCAAATT AAAGGCGGAA GTGCGGTGGT TTCTCGCAAT 
2401 GTTTCTTCAA TTGAGGGAAA TTGGACAGTC AGCAATAATG CAAATGCCAC 
2451 ATTTGGTGTT GTGCCAAATC AGCAAAATAC CATTTGCACG CGTTCAGATT 
2501 GGACAGGATT AACGACTTGT AAAACAGTTG ATTTAACCGA TAAAAAAGTT 
2551 ATTAATTCCA TACCGACAAC ACAAATTAAT GGTTCTATTA ATTTAACTGA 
2601 TAATGCAACA GTGAATATTC ATGGTTTAGC AAAACTTAAT GGTAATGTCA 
2651 CTTTAATAGA TCACAGCCAA TTTACATTGA GCAACAATGC CACCCAAGCA 
2701 GGCAATATCA AACTTTCAAA TCACGCAAAT GCAACGGTGG ACAATGCAAA 
2751 TTTGAACGGT AATGTGAATT TAATGGATTC TGCTCAATTT TCTTTAAAAA 
2801 ACAGCCATTT TTCGCACCAA ATCCAAGGTG GGGAAGACAC AACAGTGATG
2851 TTGGAAAATG CGACTTGGAC AATGCCTAGC GATACCACAT TGCAGAATTT 
2901 AACGCTAAAT AATAGTACTG TTACGTTAAA TTCAGCTTAT TCAGCTATCT 

2951 CAAATAATGC GCCACGCCGT CGCCGCCGTT CATTAGAGAC GGAAACAACG

3001 CCAACATCGG CAGAACATCG TTTCAACACA TTGACAGTAA ATGGTAAATT
3051 GAGCGGGCAA GGCACATTCC AATTTACTTC ATCTTTATTT GGCTATAAAA
3101 GCGATAAATT AAAATTATCC AATGACGCTG AGGGCGATTA CACATTATCT 
3151 GTTCGCAACA CAGGCAAAGA ACCCGTGACC TTTGGGCAAT TAACTTTGGT 
3201 TGAAAGCAAA GATAATAAAC CGTTATCAGA CAAACTCACA TTCACGTTAG 
3251 AAAATGACCA CGTTGATGCA GGTGCATTAC GTTATAAATT AGTGAAGAAT 
3301 GATGGCGAAT TCCGCTTACA TAACCCAATA AAAGAGCAGG AATTGCGCTC 
3351 TGATTTAGTA AGAGCAGAGC AAGCAGAACG AACATTAGAA GCCAAACAAG 
3401 TTGAACAGAC TGCTAAAACA CAAACAAGTA AGGCAAGAGT GCGGTCAAGA 
3451 AGAGCGGTGT TTTCTGATCC CCTGCCTGCT CAAAGCCTGT TAAACGCATT 
3501 AGAAGCCAAA CAAGCTCTGA CTACTGAAAC ACAAACAAGT AAGGCAAAAA 
3551 AAGTGCGGTG AAAAAGAGCT GCGAGAGAGT TTTCTGATAC CCTGCCTGAT 
3601 CAAATATTAC AAGCCGCACT TGAGGTTATT GATGCCCAAC AGCAAGTGAA 
3651 AAAAGAACCT CAAACTCAAG AGGAAGAAGA GAAAAGACAA CGCAAACAAA 
3701 AAGAATTGAT CAGCCGTTAC TCAAATAGTG CGTTATCGGA GTTGTCTGCG 
3751 ACAGTAAATA GTATGCTTTC CGTTCAAGAT GAATTGGATC GTCTTTTTGT 
3801 AGATCAAGCA CAATCTGCCG TGTGGACAAA TATCGCACAG GATAAAAGAC 
3851 GCTATGATTC TGATGCGTTC CGTGCTTATC AGCAGAAAAC GAACTTGCGT 
3901 CAAATTGGGG TGCAAAAAGC CTTAGATAAT GGACGAATTG GGGCGGTTTT
3951 CTCGCATAGC CGTTCAGATA ATACCTTTGA CGAACAGGTT AAAAATCACG 
4001 CGACATTAGC GATGATGTCT GGTTTTGCCC AATATCAATG GGGCGATTTA 
4051 CAATTTGGTG TAAACGTGGG TGCGGGAATT AGTGCGAGTA AAATGGCTGA 
4101 AGAACAAAGC CGAAAAATTC ATCGAAAAGC GATAAATTAT GGTGTGAATG 
4151 CAAGTTATCA GTTCCGTTTA GGGCAATTGG GTATTCAGCC TTATTTGGGT 
4201 GTTAATCGAT ATTTTATTGA ACGTGAAAAT TATCAATCTG AAGAAGTGAA 
4251 AGTGCAAACA CCGAGCCTTG TATTTAATCG CTATAATGCT GGCATTCGAG

4301 TTGATTATAC ATTTACCCCG ACAGATAATA TCAGCATTAA GCCTTATTTC

4351 TTCGTCAATT ATGTTGATGT TTCAAACGCT AACGTACAAA CCACTGTAAA

4401 TCGCACGATG TTGCAACAAT CATTTGGGCG TTATTGGCAA AAAGAAGTGG
4451 GATTAAAGGC AGAAATTTTA CATTTCCAAC TTTCCGCTTT TATCTCAAAA
4501 TCTCAAGGTT CACAACTCGG CAAACAGCAA AATGTGGGCG TGAAATTGGG 
4551 GTATCGTTGG TAAAAATCAA CATAATTTTA TCGTTTATTG ATAAACAAGG 
4601 TGGGGCAGAT CAAATCCTAC CTTTTTTATT CCAATAATGG AACTTTATTT 
4651 TATTAAAGGT ATCTAAGTAG CACCCTATAT AGGGATTAAT TAAGAGGATT 
4701 TAATAATGAA TTTAACTAAA ATTTTACCCA CATTTGCTGC TGTAGTCGTA
4751 TTATCTGCTT GTGCAAAGGA TGCACCTGAA ATGACAAAAT CATCTGCGCA 
4801 AATAGCTGAA ATGCAAACAC TT

FIG. 22C

Amino acid sequence for NTHi strain 3219B Hap protein (first 
amino acid to last amino acid) : 

1 MKKTVFRLNF LTACISLGIV SQAWAGHTYF GIDYQYYRDF AENKGKFTVG 

51 AQDIDIYNKK GEMIGTMMKG VPMPDLSSMV RGGYSTLISE QHLISVAHNV

101 GYDVVDFGME GENPDQHRFK YKVVKRYNYK SGDRQYNDYQ HPRLEKFVTE

151 TAPIEMVSYM DGNHYKNFNQ YPLRVRVGSG HQWWKDDNNK TIGDLAYGGS

201 WLIGGNTFED GPAGNGTLEL NGRVQNPNKY GPLPTAGSFG DSGSPMFIYD 

251 KEVKKWLLNG VLREGNPYAA VGNSYQITRK DYFQGILNQD ITANFWDTNA 

301 EYRFNIGSDH NGRVATIKST LPKKAIQPER IVGLYDNSQL HDARDKNGDE
351 SPSYKGPNPW SPALHHGKSI YFGDQGTGTL TIENNINQGA GGLYFEGNFV

401 VKGNQNNITW QGAGVSVGEE STVEWQVHNP EGDRLSKIGL GTLLVNGKGK
451 NLGSLSVGNG LVVLDQQADE SGQKQAFKEV GIVSGRATVQ LNSADQVDPN
501 NIYFGFRGGR LDLNGHSLTF ERIQNTDEGA MIVNHNASQT ANITITGNAT 
551 INSDSKQLTN KKDIAFNGWF GEQDKAKTNG RLNVNYQPVN AENHLLLSGG 
601 TNLNGNITQN GGTLVFSGRP TPHAYNHLRR DLSNMEGIPQ GEIVWDHDWI 
651 NRTFKAENFQ IKGGSAVVSR NVSSIEGNWT VSNNANATFG VVPNQQNTIC
701 TRSDWTGLTT CKTVDLTDKK VINSIPTTQI NGSINLTDNA TVNIHGLAKL
751 NGNVTLIDHS QFTLSNNATQ AGNIKLSNHA NATVDNANLN GNVNLMDSAQ
801 FSLKNSHFSH QIQGGEDTTV MLENATWTMP SDTTLQNLTL NNSTVTLNSA 
851 YSAISNNAPR RRRRSLETET TPTSAEHRFN TLTVNGKLSG QGTFQFTSSL
901 FGYKSDKLKL SNDAEGDYTL SVRNTGKEPV TFGQLTLVES KDNKPLSDKL 
951 TFTLENDHVD AGALRYKLVK NDGEFRLHNP IKEQELRSDL VRAEQAERTL

1001 EAKQVEQTAK TQTSKARVRS RRAVFSDPLP AQSLLNALEA KQALTTETQT 

1051 SKAKKVRSKR AAREFSDTLP DQILQAALEV IDAQQQVKKE PQTQEEEEKR 

1101 QRKQKELISR YSNSALSELS ATVNSMLSVQ DELDRLFVDQ AQSAVWTNIA 

1151 QDKRRYDSDA FRAYQQKTNL RQIGVQKALD NGRIGAVFSH SRSDNTFDEQ 

1201 VKNHATLAMM SGFAQYQWGD LQFGVNVGAG ISASKMAEEQ SRKIHRKAIN 

1251 YGVNASYQFR LGQLGIQPYL GVNRYFIERE NYQSEEVKVQ TPSLVFNRYN 

1301 AGIRVDYTFT PTDNISIKPY FFVNYVDVSN ANVQTTVNRT MLQQSFGRYW 

1351 QKEVGLKAEI LHFQLSAFIS KSQGSQLGKQ QNVGVKLGYR W 
Nucleotide sequence for NTHi strain 1396B hap gene (start codon 
begins at position 313, stop codon begins at position 4546): 

1 TGACCGCACT TTCAGAGAAA ACTCACATAA AGTGCGGTTA TTTTATTAGT 

51 GATATTGTTT TAATTTTAGT TATCTGTATA AATTACATAC AATATTAATC 

101 CATCGCAAGA TAAGATTACC CACTAAGTAT TAAGCAAAAA CCTAGAAATT 

151 TTGGCTTAAT TACTATATAG TTTTACTCAT TTATTTTCTT TTGTGCCTTT 

201 TAGTTCGTTT TTTTAGCTGA AATCCCTTAG AAAATCACCG CACTTTTATT 

251 GTTCAATAGT CGTTTAACCA CGTATTTTTT AATACGAAAA ATTACTTAAT 

301 TAAATAAACA TTATGAAAAA AACTGTATTT CGTCTGAATT TTTTAACCGC 

351 TTGCATTTCA TTAGGGATAG TATCGCAAGC GTGGGCAGGT CATACTTATT 

401 TTGGGATTGA CTACCAATAT TATCGTGATT TTGCCGAGAA TAAAGGGAAG 

451 TTCACAGTTG GGGCTAAAAA TATTGAGGTT TACAATAAAA ATGGAAATTT 

501 AGTTGGCACA TCAATGACAA AAGCCCCAAT GATTGATTTT TCCGTGGTGT 

551 CGCGAAATGG GGTGGCGGCA TTGGTGGGCG ATCAGTATAT TGTGAGTGTG 

601 GCACATAATG TAGGCTATAC CAATGTGGAT TTTGGTGCTG AAGGACAAAA 

651 TCCTGATCAA CATCGTTTTA CTTATAAAAT TGTGAAACGG AATAATTATA 

701 AAAACGATCA AACGCATCCT TATGAGAAAG ACTACCACAA CCCACGCTTA 

751 CATAAATTTG TTACGGAAGC CACCCCAATC GATATGACTT CTGATATGAA 

801 CGGCAACAAA TATACAGATA GGACGAAATA TCCCGAACGC GTGCGTATCG 

851 GCTCCGGGTG GGAGTTTTGG CGAAACGATC AAAACAACGG CGACCAAGTT 

901 GCCGGCGCAT ATCATTACCT GACAGCAGGC AATACACACA ACCAAGGCGG 

951 AGCAGGGGGC GGCTGGTCAA GTCTGAGCGG CGATGTGCGG CAAGCGGGCA 

1001 ATTACGGCCC CATTCCTATT GCAGGCTCAA GCGGCGACAG CGGTTCGCCT 

1051 ATGTTTATTT ATGATGCGGA AAAACAAAAA TGGTTGATTA ACGGCGTATT 

1101 GAGGACCGGC AACCCTTGGG CGGGGACAGA GAATACATTC CAACTGGTAC 

1151 GCAAGTCTTT TTTTGATGAA ATCCTTGAAA AAGATTTGCG TACATCGTTT 

1201 TATAGCCCAT CGGGCAATGG TGCATACACC ATTACAGACA AAGGCGACGG 

1251 CAGCGGCATT GTCAAACAAC AAACAGGAAG ACCATCTGAA GTCCGCATCG 

1301 GTTTAAAAGA CGACAAATTA CCTGCCGAAG GTAAAGACGA TGTTTACCAA 

1351 TACCAAGGTC CAAATATATA CCTGCCTCGT TTGAATAACG GTGGAAACCT 

1401 GTATTTCGGA GATCAAAAAA ACGGCACTGT TACCTTATCA ACCAACATCA 

1451 ACCAAGGTGC GGGCGGTTTG TATTTTGAGG GTAACTTTAC GGTATCTTCA 

1501 GAAAATAATG CAACTTGGCA AGGTGCTGGA GTGCATGTAG GTGAAGACAG 

1551 TACTGTTACT TGGAAAGTAA ATGGTGTTGA AAATGATCGC CTTTCTAAAA 

1601 TCGGCAAAGG CACATTGCAC GTTAAAGCCA AAGGGGAAAA TAAAGGTTCG 

1651 ATCAGCGTAG GCGATGGTAA AGTCATTTTG GAGCAGCAGG CAGACGATCA 

1701 AGGCAACAAA CAAGCCTTTA GTGAAATTGG CTTGGTTAGT GGCAGAGGTA 

1751 CGGTTCAGTT AAACGATGAC AAGCAATTTA ATACTGATAA ATTTTATTTC 

1801 GGCTTCCGTG GTGGTCGCTT AGATCTTAAT GGGCATTCAT TAACCTTTAA 

1851 ACGTATCCAA AATACGGATG AGGGAGCAAC GATTGTTAAT CACAATGCCA 

1901 CAACAGAATC TACAGTGACC ATTACTGGCA GCGATACCAT TAATGACAAC 

1951 ACTGGCGATT TAACCAATAA ACGTGATATT GCTTTTAATG GTTGGTTTGG 

2001 TGATAAAGAT GATACTAAAA ATACTGGACG TTTGAATGTT ACTTACAATC 
2051 CGCTTAACAA AGATAATCAC TTCCTTCTAT CAGGTGGAAC AAATTTAAAA 
2101 GGCAATATTA CTCAAGACGG TGGCACTTTA GTGTTTAGTG GTCGCCCAAC 

2151 ACCACACGCA TACAATCATT TAAATCGCCT AAACGAGCTT GGGCGACCTA 

2201 AGGGCGAAGT GGTTATTGAT GACGATTGGA TCAACCGTAC ATTTAAAGCT 

2251 GAAAACTTCC AAATTAAAGG CGGAAGTACG GTGGTTTCTC GCAATGTTTC 

2301 TTCAATTGAA GGAAATTGGA CAATCAGCAA TAACGCCAAC GCGACATTTG 

2351 GTGTTGTGCC AAATCAACAA AATACCATTT GCACGCGTTC AGATTGGACA 

2401 GGATTAACGA CTTGTAAAAC AGTTAATTTA ACCGATAAAA AAGTTATTGA 

2451 TTCCATACCG ACAACACAAA TTAATGGCTC TATTAATTTA ACTAATAATG 

2501 CAACAGTGAA TATTCATGGT TTAGCAAAAC TTAATGGTAA TGTCACTTTA 

2551 ATAAATCATA GCCAATTTAC ATTGAGCAAC AATGCCACCC AAACAGGCAA 

2601 TATCCAACTT TCAAATCACG CAAATGCAAC GGTGGATAAT GCAAACTTGA 

2651 ACGGTAATGT GCATTTAACG GATTCTGCTC AATTTTCTTT AAAAAACAGC 

2701 CATTTTTCGC ACCAAATTCA GGGCGACAAA GACACAACAG TGACGTTGGA 

2751 AAATGCGACT TGGACAATGC CTAGCGATAC TACATTGCAG AATTTAACGC 

2801 TAAATAATAG TACTGTTACG TTAAATTCAG CTTATTCAGC TAGCTCAAAT 

2851 AATGCGCCAC GTCACCGCCG TTCATTAGAG ACGGAAACAA CGCCAACATC 

2901 GGAAGAACAT CGTTTCAACA CATTGACAGT AAATGGTAAA TTGAGCGGGC 
2951 AAGGCACATT CCAATTTACT TCATCTTTAT TTGGCTATAA AAGCGATAAA 

3001 ATAAAATTAT CTAATGACGC TGAAGGCGAT TACACATTAG CTGTTCGCGA 
3051 CACAGGCAAA GAACCTGTGA CCCTTGAGCA ATTAACTTTA ATTGAAGGCT 
3101 TGGATAATCA ACCCTTGCCA GATAAGCTAA AAATTACTTT AAAAAATAAA 
3151 CACGTTGATG CGGGTGCATG GCGTTATGAA TTAGTGAAGA AAAACGGCGA 

3201 ATTCCGCTTG CATAATCCAA TAAAAGAGCA GGAATTGCGC AATGATTTAG 
3251 TAAAAGCAGA GCAAGTAGAA CGAGCATTAG AAGCAAAACA AGCTGAACTG 

3301 ACTACTAAAA AACAAAAAAC TGAGGCTAAA GTGCGGTCAA AAAGAGCGGC 
3351 GTTTTCTGAT ACCCCGCCTG ATCAAAGCCA GTTAAACGCA TTACAAGCCG 
3401 AACTCGAGAC GATTAATGCC CAACAGCAAG TGGCACAAGC GGTGCAAAAT 
3451 CAGAAAGTAA CTGCACTTAA CCAAAAGAAC GAGCAAGTTA AAACCACTCA 
3501 AGATAAAGCA AATTTAGTCT TGGCAACTGC ATTGGTGGAA AAAGAAACCG 
3551 CTCAGATTGA TTTTGCTAAT GCAAAATTAG CTCAGTTGAA TTTAACACAA 
3601 CAACTAGAAA AAGCCTTAGC AGTGGCTGAG CAAGCAGAAA AAGAGCGTAA 
3651 AGCTCAAGAG CAAGCGAAAA GACAACGCAA ACAAAAAGAC TTGATCAGCC 
3701 GTTATTCAAA TAGTGCGTTA TCAGAATTAT CTGCAACAGT AAATAGTATG 
3751 CTTTCCGTTC AAGATGAATT AGATCGTCTT TTTGTAGATC AAGCTCAATC 
3801 TGCGGTGTGG ACAAATATCT CACAGGATAA AAGACGTTAT GATTCTGATG 
3851 CGTTCCGTGC TTATCAGCAG AAAACGAACT TGCGTCAAAT TGGGGTGCAA 
3901 AAAGCCTTAG CTAACGGACG AATTGGGGCA GTTTTCTCGC ATAGCCGTTC 

3951 AGATAATACT TTTGATGAAC AGGTTAAAAA TCACGCAACA TTAACGATGA 
4001 TGTCGGGTTT TGCCCAATAT CAATGGGGTG ATTTACAATT TGGTGTAAAC 

4051 GTGGGAACGG GAATTAGTGC GAGTAAAATG GCTGAAGAAC AAAGCCGAAA 
4101 AATTCATCGA AAAGCGATAA ATTATGGCGT GAATGCAAGT TATTCGTTCC 
4151 ATTTAGGGCA ATTGGGTATT CAGCCTTATT TTGGAGTTAA TCGCTATTTT 
4201 ATTGAACGTA AAAATTATCA ATCTGAGGAA GTGAAAGTGC AAACACCGAG 



FIG. 24B 




4251 CCTTGCATTT AATCGCTATA ATGCTGGAGT ACGGGTCGAT TATACGTTTA 

4301 CCCCGACAGA GAATATCAGC GTTAAGCCTT ATTTCTTCGT CAATTATGTT 

4351 GATGTTTCAA ACGCTAACGT ACAAACCACT GTAAATCGCG CGGTGTTGCA 

4401 ACAACCATTT GGACGTTATT GGCAAAAAGA AGTGGGATTA AAAGCGGAAA 

4451 TTTTACATTT CCAACTTTCT GCTTTTATTT CTAAATCTCA AGGTTCGCAA 

4501 CTCGGTAAAC AGCGAAATAT GGGCGTGAAA TTAGGATATC GTTGGTAAAA 

4551 ATCAACATAA TTTTATTCTA ATAATGGAAC TTTATTTAAT TAAAAGTATC 

4601 TAAGTAGCAC CCTATAGGGG ATTAATTAAG AGGATTTAAT AATGAATTTA 

4651 ACTAAAATTT TACCCGCATT TGCTGCTGCA GTCGTATTAT CTGCTTGTGC 

4701 AAAGGATGCA CCTGAAATGA CAAAATCATC TGCGCAAATA GCTGAAATGC 

4751 AAACACTTCC AACAATCACT GATAAAACAG TTGTATATTC TTGCAATAAA 

4801 CAAACTGTGA CTGCAGTGTA TCAATTTG 

FIG. 24C 

Amino acid sequence for NTHi strain 1396B Hap protein (first 
amino acid to last amino acid) : 

1 MKKTVFRLNF LTACISLGIV SQAWAGHTYF GIDYQYYRDF AENKGKFTVG 

51 AKNIEVYNKN GNLVGTSMTK APMIDFSVVS RNGVAALVGD QYIVSVAHNV 

101 GYTNVDFGAE GQNPDQHRFT YKIVKRNNYK NDQTHPYEKD YHNPRLHKFV 

151 TEATPIDMTS DMNGNKYTDR TKYPERVRIG SGWQFWRNDQ NNGDQVAGAY 

201 HYLTAGNTHN QGGAGGGWSS LSGDVRQAGN YGPIPIAGSS GDSGSPMFIY 

251 DAEKQKWLIN GVLRTGNPWA GTENTFQLVR KSFFDEILEK DLRTSFYSPS 

301 GNGAYTITDK GDGSGIVKQQ TGRPSEVRIG LKDDKLPAEG KDDVYQYQGP 

351 NIYLPRLNNG GNLYFGDQKN GTVTLSTNIN QGAGGLYFEG NFTVSSENNA 

401 TWQGAGVHVG EDSTVTWKVN GVENDRLSKI GKGTLHVKAK GENKGSISVG 

451 DGKVILEQQA DDQGNKQAFS EIGLVSGRGT VQLNDDKQFN TDKFYFGFRG 

501 GRLDLNGHSL TFKRIQNTDE GATIVNHNAT TESTVTITGS DTINDNTGDL 

551 TNKRDIAFNG WFGDKDDTKN TGRLNVTYNP LNKDNHFLLS GGTNLKGNIT 

601 QDGGTLVFSG RPTPHAYNHL NRLNELGRPK GEVVIDDDWI NRTFKAENFQ 

651 IKGGSTVVSR NVSSIEGNWT ISNNANATFG VVPNQQNTIC TRSDWTGLTT 

701 CKTVNLTDKK VIDSIPTTQI NGSINLTNNA TVNIHGLAKL NGNVTLINHS 

751 QFTLSNNATQ TGNIQLSNHA NATVDNANLN GNVHLTDSAQ FSLKNSHFSH 

801 QIQGDKDTTV TLENATWTMP SDTTLQNLTL NNSTVTLNSA YSASSNNAPR 

851 HRRSLETETT PTSEEHRFNT LTVNGKLSGQ GTFQFTSSLF GYKSDKIKLS 

901 NDAEGDYTLA VRDTGKEPVT LEQLTLIEGL DNQPLPDKLK ITLKNKHVDA 

951 GAWRYELVKK NGEFRLHNPI KEQELRNDLV KAEQVERALE AKQAELTTKK 

1001 QKTEAKVRSK RAAFSDTPPD QSQLNALQAE LETINAQQQV AQAVQNQKVT 

1051 ALNQKNEQVK TTQDKANLVL ATALVEKETA QIDFANAKLA QLNLTQQLEK 

1101 ALAVAEQAEK ERKAQEQAKR QRKQKDLISR YSNSALSELS ATVNSMLSVQ 

1151 DELDRLFVDQ AQSAVWTNIS QDKRRYDSDA FRAYQQKTNL RQIGVQKALA 

1201 NGRIGAVFSH SRSDNTFDEQ VKNHATLTMM SGFAQYQWGD LQFGVNVGTG 

1251 ISASKMAEEQ SRKIHRKAIN YGVNASYSFH LGQLGIQPYF GVNRYFIERK 

1301 NYQSEEVKVQ TPSLAFNRYN AGVRVDYTFT PTENISVKPY FFVNYVDVSN 

1351 ANVQTTVNRA VLQQPFGRYW QKEVGLKAEI LHFQLSAFIS KSQGSQLGKQ 

1401 RNMGVKLGYR W 
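
The coordinates stated for the hap gene listing above (start codon beginning at position 313, stop codon beginning at position 4546) can be checked against the length of the amino acid sequence. The following is a minimal sketch only, not part of the original disclosure: the helper names `clean_listing`, `orf_codon_count` and `translate` are hypothetical, and the standard genetic code is assumed. It strips the position numbers and spacing from a row of the listing, counts the codons between the stated start and stop positions, and translates a row that lies in frame with the start codon.

```python
import re

# Standard genetic code, built from the conventional T, C, A, G ordering.
# (Assumption: the standard code applies; this is a sketch for checking only.)
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def clean_listing(text: str) -> str:
    """Drop position numbers, spaces and line breaks; keep only the letters."""
    return re.sub(r"[^A-Za-z]", "", text).upper()


def orf_codon_count(start: int, stop: int) -> int:
    """Codons from the 1-based start-codon position up to, but not including,
    the codon that begins at the 1-based stop position."""
    span = stop - start
    if span % 3 != 0:
        raise ValueError("start and stop positions are not in the same frame")
    return span // 3


def translate(dna: str) -> str:
    """Translate complete codons from the first base of `dna`; '*' marks a stop."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))


# Coordinates as stated in the heading of the nucleotide listing.
print(orf_codon_count(313, 4546))  # 1411, the length of the listed protein

# Position 2251 is in frame with the start codon: (2251 - 313) % 3 == 0.
row = "2251 GAAAACTTCC AAATTAAAGG CGGAAGTACG"
print(translate(clean_listing(row)))  # ENFQIKGGST
```

The first ten residues translated from position 2251 correspond to amino acids 647 to 656 of the listed protein, so the same approach can be used row by row to cross-check the two listings.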